perm filename 0[0,BGB] blob
sn#101480 filedate 1974-05-13 generic text, type C, neo UTF8
The algorithm consists of five steps: digital image
thresholding, binary image contouring, polygon nesting, polygon
smoothing, and polygon comparing.
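The first of these steps, digital image thresholding, can be sketched as
follows. This is a minimal illustration in C, assuming 8-bit grayscale
pixels and a single fixed cut level; the 216 by 288 window size is CRE's,
but the function and variable names are illustrative, not CRE's own code,
which was written in machine language.

```c
#define ROWS 216
#define COLS 288

/* Digital image thresholding: map an 8-bit grayscale image to a
 * binary image by comparing each pixel against a single cut level.
 * A sketch under the assumptions stated above, not CRE's code. */
void threshold(const unsigned char img[ROWS][COLS],
               unsigned char bin[ROWS][COLS],
               unsigned char cut)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            bin[r][c] = (img[r][c] >= cut) ? 1 : 0;
}
```

The binary image produced here is the input to the contouring step.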
The acronym CRE stands for "Contour, Region, Edge" and
for "Cart's Eye". CRE is a solution to the problem of finding
contour edges in a set of television pictures and of linking
corresponding edges from one picture to the next. The process is
automatic and is intended to run without human intervention.
Furthermore, the process is bottom up; there are no significant
inputs other than the given television images. The output of CRE is
a 2D contour map data structure which is suitable input to a 3D
geometric modeling program.
The overall design goal for CRE was to build a region edge
finding program that could be applied to a sequence of television
pictures and that would output a sequence of line drawings without
having to know anything about the content of the images. Furthermore
it was desired that the line drawings be structured. The six design
choices that determined the character of CRE are:
1. Dumb vision rather than model driven vision.
2. Multi image analysis rather than single image analysis.
3. Total image structure imposed on edge finding, rather
than separate edge finder and image analyzer.
4. Automatic rather than interactive.
5. Fixed image window size rather than variable window size.
6. Machine language rather than higher level language.
The design choices are ordered from the more strategic to
the more tactical; the first three choices being research
strategies, the latter three choices being programming tactics.
Adopting these design choices led to image contouring and contour
map structures similar to those of Krakauer[3] and Zahn[4].
The first design choice does not refer to the issue of how
model dependent a finished general vision system will be (it will be
quite model dependent), but rather to the issue of how one should
begin building such a system. I believe that the best starting
points are at the two apparent extremes of nearly total knowledge of
a particular visual world or nearly total ignorance. The first
extreme involves synthesis (by computer graphics) of a predicted 2D
image, followed by comparing the predicted and a perceived image for
slight differences which are expected but not yet measured. The
second extreme involves analysing perceived images into structures
which can be readily compared for near equality and measured for
slight differences; followed by the construction of a 3D geometric
model of the perceived world. The point is that in both cases images
are compared, and in both cases the 3D model initially (or finally)
contains specific numerical data on the geometry and physics of the
particular world being looked at.
The second design choice, of multi image analysis rather
than single image analysis, provides a basis for solving for camera
positions and feature depths. The third design choice solves (or
rather avoids) the problem of integrating an edge finder's results
into an image. By using a very simple edge finder, and by accepting
all the edges found, the image structure is never lost. This design
postpones the problem of interpreting photometric edges as physical
edges.
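A very simple edge finder of the sort described above can be sketched
as crack detection: an edge element is accepted wherever two 4-adjacent
binary pixels differ. This is an illustration under assumed names and
layout, not CRE's actual edge finder.

```c
#define ROWS 216
#define COLS 288

/* Count a "crack" edge wherever two 4-adjacent binary pixels differ.
 * Accepting every such edge, rather than filtering, is what keeps the
 * total image structure intact. Names here are assumptions. */
int count_crack_edges(const unsigned char bin[ROWS][COLS])
{
    int cracks = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++) {
            if (c + 1 < COLS && bin[r][c] != bin[r][c + 1])
                cracks++;               /* crack between columns */
            if (r + 1 < ROWS && bin[r][c] != bin[r + 1][c])
                cracks++;               /* crack between rows    */
        }
    return cracks;
}
```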
The fourth choice is a resolution not to write an image
processor that requires operator assistance and parameter tuning.
The fifth choice of the 216 by 288 fixed window size is a sin that
proved surprisingly expedient; it is explained later. A variable
window version of CRE at halves, thirds and other simple fractions
of its present window size will be made at some future date.
The final design choice of using machine language was for
the sake of implementing node link data structures that are
processed 100 times faster than LEAP, 10 times faster than compiled LISP
and that require significantly less memory than similar structures
in either LISP or LEAP. Furthermore machine code assembles and loads
faster than higher level languages; and machine code can be
extensively fixed and altered without recompiling.
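The node link structures mentioned here can be sketched in C as a
circular doubly linked ring of polygon vertices. This is an assumed
illustration of the general technique; the field names are not CRE's
actual PDP-10 record layout.

```c
#include <stdlib.h>

/* A node of a circular doubly linked polygon ring, of the kind a
 * contour map structure might use. Field names are assumptions. */
typedef struct node {
    struct node *next;  /* following vertex on the polygon  */
    struct node *prev;  /* preceding vertex on the polygon  */
    int row, col;       /* image coordinates of this vertex */
} Node;

/* Create a one-node ring that links to itself. */
Node *make_ring(int row, int col)
{
    Node *n = malloc(sizeof *n);
    n->row = row;
    n->col = col;
    n->next = n->prev = n;
    return n;
}

/* Splice a new vertex into the ring just after position p; only the
 * two neighboring links change, with no traversal or copying. */
Node *insert_after(Node *p, int row, int col)
{
    Node *n = malloc(sizeof *n);
    n->row = row;
    n->col = col;
    n->next = p->next;
    n->prev = p;
    p->next->prev = n;
    p->next = n;
    return n;
}
```

Constant-time splicing of this kind is the sort of operation that
hand-coded node link records make cheap in both time and memory.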
It is my impression that CRE does not raise any new
scientific problems; nor does it have any really new solutions to
the old problems; rather CRE is another competent video region edge
finding program with its own set of tricks. However, it is further
my impression that the particular tricks for smoothing, nesting and
comparing polygons in CRE are original as programming techniques.